Abstract

The aim of this paper is to provide a probabilistic, but non-quantum, analysis
of the Halting Problem. Our approach is to have the probability space extend over
both space and time and to consider the probability that a random N-bit program
has halted by a random time. We postulate an a priori computable probability
distribution on all possible runtimes and we prove that given an integer k > 0, we
can effectively compute a time bound T such that the probability that an N-bit
program will eventually halt given that it has not halted by T is smaller than $2^{-k}$.

We also show that the set of halting programs (which is computably enumerable, 
but not computable) can be written as a disjoint union of a computable set and a 
set of effectively vanishing probability. 

Finally, we show that "long" runtimes are effectively rare. More formally, the
set of times at which an N-bit program can stop after the time $2^{N+\text{constant}}$ has
effectively zero density.



1 Introduction 

The Halting Problem for Turing machines is to decide whether an arbitrary Turing
machine M eventually halts on an arbitrary input x. As a Turing machine M can be
coded by a finite string, say code(M), one can ask whether there is a Turing machine
$M_{\text{halt}}$ which, given code(M) and the input x, eventually stops and produces 1 if M(x)
stops, and 0 if M(x) does not stop. Turing's famous result states that this problem cannot
be solved by any Turing machine, i.e. there is no such $M_{\text{halt}}$. Halting computations can be
recognised by simply running them; the main difficulty is to detect non-halting programs.
In what follows time is discrete.



Since many real-world problems arising in the fields of compiler optimisation, au- 
tomatised software engineering, formal proof systems, and so forth are equivalent to the 
Halting Problem, there is a strong interest — not merely academic! — in understanding the 
problem better and in providing alternative solutions. 

There are two approaches we can take to calculating the probability that an N-bit
program will halt. The first approach, initiated by Chaitin [10], is to have the probability
space range only over programs; this is the approach taken when computing the Omega
number, [U |2]. In that case, the probability is $\mathrm{Prob}_N = \#\{p \in \Sigma^N \mid p \text{ halts}\}/2^N$. For a
self-delimiting machine, $\mathrm{Prob}_N$ goes to zero when N tends to infinity, since it becomes
more and more likely that any given N-bit string is an extension of a shorter halting
program. For a universal non-self-delimiting Turing machine, the probability is always
nonzero for large enough N: after some point, the universal non-self-delimiting Turing
machine will simulate a total Turing machine (one that halts on all inputs), so some
fixed proportion of the space will always contribute. The probability in this case is
uncomputable and machine-dependent; in general, 1 is the best computable upper bound
one can find. In this approach it matters only whether a program halts or not; the time
at which a halting program stops is irrelevant.

Our approach is to have the probability space extend over both space and time, and
to consider the probability that a random N-bit program, which hasn't stopped by some
given time, will halt by a random later time. In this approach, the stopping time of
a halting program is paramount. The problem is that there is no uniform distribution
on the integers, so we must choose some kind of distribution on times as well. Under any
distribution we choose, most long times must be rare, so in the limit the choice of
distribution doesn't matter very much.

The new approach was proposed by Calude and Pavlov [7] (see also [1]), where a
mathematical quantum "device" was constructed to probabilistically solve the Halting
Problem. The procedure has two steps: a) based on the length of the program and
an a priori admissible error $2^{-k}$, a finite time T is effectively computed, b) a quantum
"device", designed to work on a randomly chosen test-vector, is run for the time T; if the
device produces a click, then the program halts; otherwise, the program probably does
not halt, with probability higher than $1 - 2^{-k}$. This result uses an unconventional model
of quantum computing, an infinite dimensional Hilbert space. This quantum proposal
has been discussed in Ziegler [17].

It is natural to ask whether the quantum mechanics machinery is essential for ob- 
taining the result. In this paper we discuss a method to "de-quantize" the algorithm. 
We have been motivated by some recent approximate solutions to the Halting Problem 
obtained by Kohler, Schindelhauer and M. Ziegler [T3] and experimental work [H [T^jFI 
Different approaches were proposed by Hamkins and Miasnikov [15], and D'Abramo [12].

Our working hypothesis, crucial for this approach, is to postulate an a priori
computable probability distribution on all possible runtimes. Consequently, the probability
space is the product of the space of programs of fixed length (or of all possible lengths),
where programs are uniformly distributed, and the time space, which is discrete and has
an a priori computable probability distribution. In this context we show that given an
integer k > 0, we can effectively compute a time bound T such that the probability on
the product space that an N-bit program will eventually halt given that it has not stopped
by T is smaller than $2^{-k}$. This phenomenon is similar to the one described for proofs in
formal axiomatic systems [5].

1 For example, Langdon and Poli [14] suggest that, for a specific universal machine that they describe,
about $N^{-1/2}$ programs of length N halt.

We also show that for every integer k > 0, the set of halting programs (which is 
computably enumerable, but not computable) can be written as a disjoint union of a 
computable set and a set of probability effectively smaller than 2~ k . 

Of course, an important question is to what extent the postulated hypothesis is 
acceptable/realistic. The next part of the paper deals with this question offering an 
argument in favor of the hypothesis. First we note that for any computable probability 
distribution most long times are effectively rare, so in the limit they all have the same 
behavior irrespective of the choice of the distribution. Our second argument is based 
on the common properties of the times when programs may stop. Our proof consists of
three parts: a) the exact time a program stops is algorithmically not too complicated;
it is (algorithmically) nonrandom because most programs either stop 'quickly' or never
halt, b) an N-bit program which hasn't stopped by time $2^{N+\text{constant}}$ cannot halt at a
later random time, c) since nonrandom times are (effectively) rare, the density of times
at which an N-bit program can stop vanishes.

We will start by examining a particular case in detail, the behavior of all programs of 
length 3 for a certain Turing machine. This case study will describe various possibilities 
of halting/ non-halting programs, the difference between a program stopping exactly at 
a time and a program stopping by some time, as well as the corresponding probabilities 
for each such event. 

Finally, we show some differences between the halting probabilities for different types 
of universal machines. 

Some comments will be made about the "practicality" of the results presented in this 
paper: can we use them to approximately solve any mathematical problems? 



2 A case study 

We consider all programs of length N = 3 for a simple Turing machine M whose domain
is the finite set {000, 010, 011, 100, 110, 111}. The halting "history" of these programs,
presented in Table 1, shows the times at which the programs in the domain of M halt.
The program $M(p_1)$ halts at time t = 1, so it is indicated by an "h" on the row
corresponding to $p_1$ and time t = 1; the program $M(p_4)$ hasn't halted at time t = 5, so on
the row corresponding to $p_4$ and time t = 4 there is a blank. Finally, programs which
haven't stopped by time t = 17 never stop.

Table 1: Halting "history" for 3-bit programs.



Here are a few miscellaneous facts we can derive from Table 1: 

• the program $M(p_1)$ halts exactly at time t = 1,

• the set of 3-bit programs which halt exactly at time t = 1 consists of $\{p_1, p_7\}$,
so the 'probability' that a randomly chosen 3-bit program halts at time t = 1 is
#{3-bit programs halting at time 1}/#{3-bit programs} = $\#\{p_1, p_7\}/8$ = 2/8 =
1/4,

• the set of 3-bit programs which halt by time t = 8 consists of $\{p_1, p_4, p_7\}$, so the
'probability' that a randomly picked 3-bit program halts by time t = 8 is #{3-bit
programs halting by time 8}/#{3-bit programs} = $\#\{p_1, p_4, p_7\}/8$ = 3/8,

• the 'probability' that a random 3-bit program eventually stops is #{3-bit programs
that halt}/#{3-bit programs} = 6/8,

• the program $M(p_4)$ hasn't stopped by time t = 5, but stops at time t = 8,

• the 'probability' that a 3-bit program does not stop by time t = 5 and that it
eventually halts is #{3-bit programs that eventually halt that have not stopped
by time t = 5}/#{3-bit programs} = $\#\{p_3, p_4, p_5, p_8\}/8$ = 4/8 = 1/2,

• the 'probability' that a 3-bit program eventually stops given that it has not halted
by time t = 5 is #{3-bit programs that eventually halt that have not stopped by
time t = 5}/#{3-bit programs that have not halted by time t = 5} = 4/(8 - 2) =
2/3,

• the 'probability' that a 3-bit program halts at time t = 8 given that it has not halted
by time t = 5 is #{3-bit programs that halt by time t = 8 but not by time t =
5}/#{3-bit programs that have not halted by time t = 5} = 1/6.
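
The counts in this list can be reproduced mechanically. The sketch below is only illustrative: Table 1 is not reproduced here, so the halting times used for p3, p5 and p8 are hypothetical placeholders (the text only implies that they lie between 9 and 17), while the times for p1, p4 and p7 are the ones quoted above. The helper names are ours.

```python
from fractions import Fraction

# Halting times of the 3-bit programs p1..p8; None means "never halts".
# p1, p4 and p7 follow the text; the times for p3, p5 and p8 are
# hypothetical placeholders (the text only implies 8 < t <= 17).
HALT_TIME = {1: 1, 2: None, 3: 10, 4: 8, 5: 12, 6: None, 7: 1, 8: 17}
TOTAL = len(HALT_TIME)

def count(pred):
    """Number of programs whose halting time satisfies pred."""
    return sum(1 for t in HALT_TIME.values() if pred(t))

def halts_at(time):
    """'Probability' that a program halts exactly at `time`."""
    return Fraction(count(lambda t: t == time), TOTAL)

def halts_by(time):
    """'Probability' that a program halts at or before `time`."""
    return Fraction(count(lambda t: t is not None and t <= time), TOTAL)

def halts_eventually_given_not_by(time):
    """'Probability' of eventually halting, given no halt by `time`."""
    running = count(lambda t: t is None or t > time)
    later = count(lambda t: t is not None and t > time)
    return Fraction(later, running)

def halts_at_given_not_by(at, time):
    """'Probability' of halting exactly at `at`, given no halt by `time`."""
    running = count(lambda t: t is None or t > time)
    return Fraction(count(lambda t: t == at and t > time), running)

print(halts_at(1), halts_by(8), halts_by(17))
print(halts_eventually_given_not_by(5), halts_at_given_not_by(8, 5))
```

The conditional quantities divide by the number of programs still running at the conditioning time, which is why 4/6 rather than 4/8 appears.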

We can express the above facts in a bit more formal way as follows. We fix a universal
Turing machine U (see section 3 for a definition) and a pair (N, T), where N represents
the length of the program and T is the "time-window", i.e. the interval {1, 2, ..., T},
where the computational time is observed. The probability space is thus

\[ \mathrm{Space}_{N,T} = \Sigma^N \times \{1, 2, \ldots, T\}. \]

Both programs and times are assumed to be uniformly distributed. For $A \subseteq \mathrm{Space}_{N,T}$
we define $\mathrm{Prob}_{N,T}(A)$ to be $\#(A) \cdot 2^{-N} \cdot T^{-1}$.

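Define

\[ A_{N,T} = \{(p,t) \in \mathrm{Space}_{N,T} \mid U(p) \text{ stops exactly at time } t\}, \]

and

\[ B_{N,T} = \{(p,t) \in \mathrm{Space}_{N,T} \mid U(p) \text{ stops by time } t\}. \]

Fact 1. We have: $\mathrm{Prob}_{N,T}(A_{N,T}) \le \frac{1}{T}$ and $\mathrm{Prob}_{N,T}(B_{N,T}) \le 1$.
Proof. It is easy to see that $\#(A_{N,T}) \le 2^N$, consequently,

\[ \mathrm{Prob}_{N,T}(A_{N,T}) = \frac{\#(A_{N,T})}{2^N \cdot T} \le \frac{2^N}{2^N \cdot T} = \frac{1}{T}, \qquad \mathrm{Prob}_{N,T}(B_{N,T}) \le \frac{2^N \cdot T}{2^N \cdot T} = 1. \]

□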

Comment. The inequality $\mathrm{Prob}_{N,T}(B_{N,T}) \le 1$ does not seem to be very informative.
However, for all N, one can construct a universal Turing machine $U_N$ such that
$\mathrm{Prob}_{N,T}(B_{N,T}) = 1$; $U_N$ cannot be self-delimiting (see, for a definition, section 4). There is
no universal Turing machine U such that $\mathrm{Prob}_{N,T}(B_{N,T}) = 1$, for all N, so can we do better
than stated in Fact 1?

More precisely, we are interested in the following problem: 

We are given a universal Turing machine U and a randomly chosen program
p of length N that we know not to stop by time t. Can we effectively evaluate
the 'probability' that U(p) eventually stops?

An obvious way to proceed is the following. Simply, run in parallel all programs of
length N till the time $T_N = \max\{t_p \mid |p| = N, U(p) \text{ halts}\} = \min\{t \mid \text{for all } |p| = N, t_p \le t\}$,
where $t_p$ is the exact time U(p) halts (if indeed it stops). In other words, get the
analogue of the Table 1 for U and N, and then calculate directly all probabilities. This
method, as simple as it may look, isn't very useful, since the time $T_N$ is not computable
because of the undecidability of the Halting Problem.
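
To make the "run in parallel" idea concrete, here is a minimal sketch in which programs are modelled as Python generators (one `next` call per step) rather than as inputs to a universal machine; the toy programs and the function names are assumptions made for illustration. Since the time bound for all N-bit programs is not computable, the loop can only be given an arbitrary step budget, and programs still running at the end remain undecided.

```python
def run_in_lockstep(programs, budget):
    """Advance every program one step per tick, for at most `budget` ticks.

    `programs` maps a name to a generator; exhausting the generator
    models halting. Returns {name: halting time} for the programs that
    halted within the budget; the others are simply undecided.
    """
    running = dict(programs)
    halted = {}
    for tick in range(1, budget + 1):
        for name in list(running):
            try:
                next(running[name])
            except StopIteration:
                halted[name] = tick
                del running[name]
    return halted

def steps(n):
    """Toy program that halts at its n-th step."""
    for _ in range(n - 1):
        yield

def forever():
    """Toy program that never halts."""
    while True:
        yield

toy = {"000": steps(1), "001": forever(), "011": steps(8)}
result = run_in_lockstep(toy, budget=17)
print(result)
```

With a budget of 17 the program "001" is reported neither as halting nor as non-halting, which is exactly the difficulty described above.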

Can we overcome this serious difficulty? 

3 Notation 

All strings are binary and the set of strings is denoted by $\Sigma^*$. The length of the string x is
denoted by |x|. The logarithms are binary too. Let $\mathbb{N} = \{1, 2, \ldots\}$ and let bin : $\mathbb{N} \to \Sigma^*$
be the computable bijection which associates to every n ≥ 1 its binary expansion without
the leading 1; for example, bin(1) is the empty string, bin(2) = 0, bin(3) = 1 and bin(4) = 00.

We will work with Turing machines M which process strings into strings. The domain
of M, dom(M), is the set of strings on which M halts (is defined). The natural complexity
[9] of the string $x \in \Sigma^*$ (with respect to M) is $V_M(x) = \min\{n \ge 1 \mid M(\mathrm{bin}(n)) = x\}$.
The Invariance Theorem [2] has the following form: we can effectively construct a
machine U (called universal) such that for every machine M, there is a constant $\varepsilon > 0$
(depending on U and M) such that $V_U(x) \le \varepsilon \cdot V_M(x)$, for all strings x. For example,
if $U(0^i 1 x) = M_i(x)$ (where $(M_i)$ is an effective enumeration of all Turing machines),
then $V_U(x) \le (2^{i+1} + 1) \cdot V_{M_i}(x)$, because $0^i 1\,\mathrm{bin}(m) = \mathrm{bin}(2^{i+1+\lfloor \log(m) \rfloor} + m)$, for all
m ≥ 1. In what follows we will fix a universal Turing machine U and we will write V
instead of $V_U$. There are some advantages in working with the complexity V instead
of the classical complexity K (see [2]); for example, for every N > 0, the inequality
$\#\{x \in \Sigma^* : V(x) \le N\} \le N$ is obvious; a better example appears in [S] where V is a
more natural measure to investigate the relation between incompleteness and uncertainty.
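
The coding bin and the arithmetic behind the constant $2^{i+1} + 1$ are easy to check by machine. The following sketch uses our own helper names (`bin_`, `bin_inv`, `encode`); it verifies, on a finite range only, that the string $0^i 1\,\mathrm{bin}(m)$ is the code of the number $2^{i+1+\lfloor \log m \rfloor} + m$ and that this number is at most $(2^{i+1} + 1) \cdot m$.

```python
def bin_(n):
    """bin(n): binary expansion of n >= 1 without the leading 1."""
    if n < 1:
        raise ValueError("bin is defined for n >= 1")
    return format(n, "b")[1:]

def bin_inv(s):
    """Inverse of bin_: put the leading 1 back and read the number."""
    return int("1" + s, 2)

def encode(i, m):
    """The natural number whose bin-code is the string 0^i 1 bin(m)."""
    return bin_inv("0" * i + "1" + bin_(m))

for i in range(6):
    for m in range(1, 200):
        floor_log = m.bit_length() - 1      # floor(log2 m) = |bin(m)|
        n = encode(i, m)
        assert n == 2 ** (i + 1 + floor_log) + m
        assert n <= (2 ** (i + 1) + 1) * m
```

The second assertion is the inequality used in the example: since $2^{\lfloor \log m \rfloor} \le m$, the code of $0^i 1\,\mathrm{bin}(m)$ is at most $2^{i+1} m + m$.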



4 Halting according to a computable time distribution

We postulate an a priori computable probability distribution on all possible runtimes. 
Consequently, the probability space is the product of the space of programs — either taken 
to be all programs of a fixed length, where programs are uniformly distributed, or to be 
all programs of all possible lengths, where the distribution depends on the length — and 
the time space, which is discrete and has an a priori computable probability distribution. 

In what follows we randomly choose a time i according to a probability distribution
p(i) which effectively converges to 1, that is, there exists a computable function
B such that for every n ≥ B(k),

\[ \left| 1 - \sum_{i=1}^{n} p(i) \right| < 2^{-k}. \]
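
As a concrete instance of effective convergence, consider the geometric distribution $p(i) = 2^{-i}$, which is our illustrative choice and not a distribution singled out by the text. Its partial sums miss 1 by exactly $2^{-n}$, so B(k) = k + 1 works. The sketch below checks this with exact rational arithmetic.

```python
from fractions import Fraction

def p(i):
    """Illustrative time distribution p(i) = 2^-i on i = 1, 2, ..."""
    return Fraction(1, 2 ** i)

def B(k):
    """Bound for this p: |1 - sum_{i<=n} p(i)| = 2^-n < 2^-k once n > k."""
    return k + 1

def deficit(n):
    """Exact value of |1 - sum_{i=1}^{n} p(i)|."""
    return abs(1 - sum(p(i) for i in range(1, n + 1)))

for k in range(1, 11):
    assert all(deficit(n) < Fraction(1, 2 ** k) for n in range(B(k), B(k) + 5))
```

For a distribution without a closed-form tail, B would have to come from an explicit, provable bound on the tail; a numerical search alone would not certify it.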

How long does it take for an N-bit program p to run without halting on U to conclude
that the probability that U(p) eventually stops is less than $2^{-k}$?

It is not difficult to see that the probability that an N-bit program which hasn't
stopped on U by time t (which can be effectively computed) will eventually halt is not
larger than $\sum_{i > t} p(i)$, which effectively converges to 0, that is, there is a computable
function b(k) such that for n ≥ b(k), $\sum_{i \ge n} p(i) < 2^{-k}$.
The probability distribution p(i) may or may not be related to the computational
runtime of a program for U. Here is an example of a probability distribution which
effectively converges to 1 and relates the observer time to the computational runtime.
This probability distribution is reminiscent of Chaitin's halting probability [2], but in
contrast, is computable.

The idea is to define the distribution at moment i to be $2^{-i}$ divided by the exact
time it takes U(bin(i)) to halt, or to be 0 if U(bin(i)) does not halt. Recall that $t_p$ is the
exact time U(p) halts (or $t_p = \infty$ when U(p) does not halt).

First we define the number

\[ \Upsilon_U = \sum_{i \ge 1} \frac{2^{-i}}{t_{\mathrm{bin}(i)}}. \]

It is clear that $0 < \Upsilon_U < 1$. Moreover, $\Upsilon_U$ is computable. Indeed, we construct
an algorithm computing, for every positive integer n, the nth digit of $\Upsilon_U$. The proof
is simple: only the terms $2^{-i}/t_{\mathrm{bin}(i)}$ for which U(bin(i)) does not halt, i.e. $t_{\mathrm{bin}(i)} = \infty$,
produce 'false' information because at every finite step of the computation they appear to
be non-zero when, in fact, they are zero! The solution is to run all non-stopping programs
U(bin(i)) for enough time such that their cumulative contribution is too small to affect
the nth digit of $\Upsilon_U$: indeed, if n > 2, and $t_{\mathrm{bin}(i)} = i$, for i ≥ n, then
$\sum_{i=n}^{\infty} 2^{-i}/t_{\mathrm{bin}(i)} < 2^{-n}$.
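
The approximation idea can be sketched on a toy example. Everything below is an assumption made for illustration: `toy_time` is a hypothetical stand-in for the halting times of a real universal machine, and the step budget $2^n$ is one convenient choice, not necessarily the one intended by the authors. Programs with index above n contribute less than $2^{-n}$ in total, and a program with index at most n that is still running after $2^n$ steps contributes less than $2^{-i} \cdot 2^{-n}$, so the truncated sum is within $2^{1-n}$ of the true value.

```python
from fractions import Fraction

def toy_time(i):
    """Hypothetical stand-in for the halting time of program bin(i),
    or None if it never halts. Not a real universal machine."""
    return None if i % 3 == 0 else i * i

def halted_within(i, steps):
    """What a step-bounded simulation can observe: the halting time if
    the program stops within `steps`, otherwise None."""
    t = toy_time(i)
    return t if t is not None and t <= steps else None

def approx_upsilon(n):
    """Lower approximation of sum_i 2^-i / t_i, with error < 2^(1-n)."""
    steps = 2 ** n
    total = Fraction(0)
    for i in range(1, n + 1):
        t = halted_within(i, steps)
        if t is not None:
            total += Fraction(1, 2 ** i * t)
    return total

print(float(approx_upsilon(10)))
```

Only `halted_within` is consulted by the approximation, mirroring the fact that a real computation never learns that a program diverges; it only sees that it has not halted yet.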

So, $\Upsilon_U$ induces a natural probability distribution on the runtime: to i we associate

\[ p(i) = \frac{1}{\Upsilon_U} \cdot \frac{2^{-i}}{t_{\mathrm{bin}(i)}}. \qquad (1) \]

The probability space is

\[ \mathrm{Space}_{N,\{p(i)\}} = \Sigma^N \times \{1, 2, \ldots\}, \]

where N-bit programs are assumed to be uniformly distributed, and we choose at random
a runtime from distribution (1).

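Theorem 2. Assume that U(p) has not stopped by time $T > k - \lfloor \log \Upsilon_U \rfloor$. Then, the
probability (according to the distribution (1)) that U(p) eventually halts is smaller than
$2^{-k}$.

Proof. It is seen that

\[ \frac{1}{\Upsilon_U} \sum_{i > T} \frac{2^{-i}}{t_{\mathrm{bin}(i)}} \le \frac{1}{\Upsilon_U} \sum_{i > T} 2^{-i} = \frac{2^{-T}}{\Upsilon_U} < 2^{-k}, \]

for $T > k - \lfloor \log \Upsilon_U \rfloor$. The bound is computable because $\Upsilon_U$ is computable.

□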
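
For a feel of the size of the bound, the threshold can be computed exactly once a value of $\Upsilon_U$ is given. The sketch below takes that value as a rational number supplied by the caller (the values used in the checks are hypothetical, not those of any actual machine) and returns the smallest integer T with $T > k - \lfloor \log \Upsilon_U \rfloor$. Since every halting time is at least 1, the tail of distribution (1) beyond T is at most $2^{-T}/\Upsilon_U$, which is what the final check compares with $2^{-k}$.

```python
from fractions import Fraction

def time_threshold(k, upsilon):
    """Smallest integer T with T > k - floor(log2(upsilon))."""
    # floor(log2) of a positive rational, computed exactly with integers
    num, den = upsilon.numerator, upsilon.denominator
    floor_log = num.bit_length() - den.bit_length()
    if Fraction(2) ** floor_log > upsilon:
        floor_log -= 1
    return k - floor_log + 1

upsilon = Fraction(3, 16)          # hypothetical value, for illustration
T = time_threshold(5, upsilon)
assert Fraction(1, 2 ** T) / upsilon < Fraction(1, 2 ** 5)
print(T)
```

The threshold grows only linearly in k; the dependence on the machine is hidden in $\Upsilon_U$.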



Of course, instead of $2^{-i}/t_{\mathrm{bin}(i)}$ we can take $r_i/t_{\mathrm{bin}(i)}$, where $\sum_{i \ge 1} r_i < \infty$, effectively.

We now consider the probability space to be

\[ \mathrm{Space}_{\{p(i)\}} = \Sigma^* \times \{1, 2, \ldots\}, \]

where N-bit programs are assumed to be uniformly distributed, and the runtime is chosen
at random from the computable probability distribution {p(i)}.

Theorem 3. Assume that U and $\mathrm{Space}_{\{p(i)\}}$ have been fixed. For every integer k > 0,
the set of halting programs for U can be written as a disjoint union of a computable set
and a set of probability effectively smaller than $2^{-k}$.

Proof. Let b be a computable function such that for n ≥ b(k) we have $\sum_{i \ge n} p(i) < 2^{-k}$.
The set of halting programs for U can be written as a disjoint union of the computable
set $\{(p, t_p) \mid t_p < 2^{b(k+|p|+2)}\}$ and the set $\{(p, t_p) \mid 2^{b(k+|p|+2)} \le t_p < \infty\}$. The last set has
probability effectively less than

\[ \sum_{N=1}^{\infty} \sum_{n=b(k+N+2)}^{\infty} p(n) \le \sum_{N=1}^{\infty} 2^{-N-k-2} = 2^{-k-2} < 2^{-k}. \]

□

Comment. A stronger (in the sense that the computable set is even polynomially
decidable), but machine-dependent, decomposition theorem for the set of halting programs
was proved in [15].

5 How long does it take for a halting program to 
stop? 

The common wisdom says that it is possible to write short programs which stop after 
a very long time. However, it is less obvious that there are only a few such programs; 
these programs are "exceptions". 

Working with self-delimiting Turing machines, Chaitin [11] has given the following
estimation of the complexity of the runtime of a program which eventually halts: there
is a constant c such that if U(bin(i)) halts in time t, then

\[ V(\mathrm{bin}(t)) \le 2^{|\mathrm{bin}(i)|} \cdot c \le i \cdot c. \qquad (2) \]

Here t is the first time U(bin(i)) halts. The above relation puts a limit on the complexity
of the time t a program bin(i), that eventually halts on U, has to run before it stops; this
translates into a limit on the time t because only finitely many strings have complexity
bounded by a constant. In view of (2), the bound depends only upon the length of the
program; the program itself (i.e. bin(i)) does not matter.

3 Chaitin used program-size complexity.

4 Of course, if U(bin(i)) halts in time t, it stops also in time t' > t, but only finitely many t' satisfy
the inequality (2). For the reader more familiar with the program-size complexity H (with respect to a
universal self-delimiting Turing machine [2]), the inequality (2) corresponds to H(bin(t)) ≤ |bin(i)| + c.

Because $\lim_{t \to \infty} V(\mathrm{bin}(t)) = \infty$, there are only finitely many integers t satisfying
the inequality (2). That is, there exists a critical value $T_{\text{critical}}$ (depending upon U and
|bin(i)|) such that if for each $t < T_{\text{critical}}$, U(bin(i)) does not stop in time t, then U(bin(i))
never halts. In other words,

if U(bin(i)) does not stop in time $T_{\text{critical}}$, then U(bin(i)) never halts.

So, what prevents us from running the computation U(bin(i)) for the time $T_{\text{critical}}$
and deciding whether it halts or not? Obviously, the uncomputability of $T_{\text{critical}}$. Neither
the natural complexity V nor any other size complexity, like K or H, is computable (see
[2]). Obviously, there are large integers t with small complexity V(bin(t)), but they
cannot be effectively "separated" because we cannot effectively compute a bound b(k)
such that V(bin(t)) > k whenever t > b(k).

The above analysis suggests that a program that has not stopped after running for
a long time has smaller and smaller chances to eventually stop. The bound (2) is not
computable. Still, can we "extract information" from the inequality (2) to derive a
computable probabilistic description of this phenomenon?

Without loss of generality, we assume that the universal Turing machine U has a
built-in counting instruction. Based on this, there is an effective transformation which
for each program p produces a new program time(p) such that there is a constant c > 0
(depending upon U) for which the following three conditions are satisfied:

1. U(p) halts iff U(time(p)) halts,

2. |time(p)| ≤ |p| + c,

3. if U(p) halts, then it halts at the step $t_p = \mathrm{bin}^{-1}(U(\mathrm{time}(p)))$.

Intuitively, time(p) either calculates the number of steps $t_p$ till U(p) halts and prints
bin($t_p$), or, if U(p) is not defined, never halts. The constant c can be taken to be less
than or equal to 2, as the counting instruction is used only once, and we need one more
instruction to print its value; however, we don't need to print the value U(p).

We continue with a proof of the inequality (2) for an arbitrary universal Turing
machine.

Theorem 4. Assume that U(p) stops at time $t_p$, exactly. Then,

\[ V(\mathrm{bin}(t_p)) < 2^{|p|+c+1}. \qquad (3) \]

Proof. First we note that for every program p of length at most N, $\mathrm{bin}^{-1}(p) < 2^{N+1}$.
Indeed, $|p| = |\mathrm{bin}(\mathrm{bin}^{-1}(p))| \le N$ implies

\[ 2^{|p|} \le \mathrm{bin}^{-1}(p) < 2^{|p|+1} \le 2^{N+1}. \qquad (4) \]

Since $U(p) = U(\mathrm{bin}(\mathrm{bin}^{-1}(p)))$ we have:

\[ V(U(p)) = \min\{i \ge 1 : U(\mathrm{bin}(i)) = U(p)\} \le \mathrm{bin}^{-1}(p), \]

hence

\[ V(\mathrm{bin}(t_p)) = V(U(\mathrm{time}(p))) \le \mathrm{bin}^{-1}(\mathrm{time}(p)) < 2^{|p|+c+1}, \]

because |time(p)| ≤ |p| + c and (4). □



6 Can a program stop at an algorithmically random 
time? 

In this section we prove that no program of length N ≥ 2 which has not stopped by time
$2^{2N+2c+1}$ will stop at an algorithmically random time. Consequently, since algorithmically
nonrandom times are (effectively) rare, there are only a few times an N-bit program can
stop in a suitably large range. As a consequence, the set of times at which an N-bit
program can stop after the time $2^{N+\text{constant}}$ has effectively zero density.

A binary string x is "algorithmically random" if $V(x) \ge 2^{|x|}/|x|$. Most binary strings
of a given length n are algorithmically random because they have high density:
$\#\{x \in \Sigma^* : |x| = n, V(x) \ge 2^n/n\} \cdot 2^{-n} \ge 1 - 1/n$, which tends to 1 when $n \to \infty$.
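
The density estimate follows from the counting inequality $\#\{x \in \Sigma^* : V(x) \le N\} \le N$ of section 3. A short derivation, given here as a sketch of one way to fill in the step:

```latex
\begin{align*}
\#\{x : |x| = n,\ V(x) < 2^n/n\}
  &\le \#\{x : V(x) \le \lfloor 2^n/n \rfloor\}
   \le \lfloor 2^n/n \rfloor \le 2^n/n, \\
\#\{x : |x| = n,\ V(x) \ge 2^n/n\} \cdot 2^{-n}
  &\ge \left(2^n - 2^n/n\right) \cdot 2^{-n} = 1 - 1/n .
\end{align*}
```

The first line uses that V takes integer values, so V(x) < r implies V(x) ≤ ⌊r⌋.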

A time t will be called "algorithmically random" if bin(t) is algorithmically random. 

Theorem 5. Assume that an N-bit program p has not stopped on U by time $2^{2N+2c+1}$,
where N ≥ 2 and c comes from Theorem 4. Then, U(p) cannot exactly stop at any
algorithmically random time $t \ge 2^{2N+2c+1}$.

Proof. First we prove that for every n ≥ 4 and $t \ge 2^{2n-1}$, we have:

\[ 2^{|\mathrm{bin}(t)|} \ge 2^n \cdot |\mathrm{bin}(t)|. \qquad (5) \]

Indeed, the real function $f(x) = 2^x/x$ is strictly increasing for x ≥ 2 and tends to
infinity when $x \to \infty$. Let m = |bin(t)|. As $2^{2n-1} \le t < 2^{m+1}$, it follows that m ≥ 2n - 1,
hence $2^m/m \ge 2^{2n-1}/(2n-1) \ge 2^n$. The inequality is true for every |bin(t)| ≥ 2n - 1,
that is, for every $t \ge 2^{2n-1}$.

Next we take n = N + c + 1 in (5) and we prove that every algorithmically random
time $t \ge 2^{2N+2c+1}$, N ≥ 2, does not satisfy the inequality (3). Indeed, for such a time,
$V(\mathrm{bin}(t)) \ge 2^{|\mathrm{bin}(t)|}/|\mathrm{bin}(t)| \ge 2^{N+c+1}$. Consequently, no program
of length N which has not stopped by time $2^{2N+2c+1}$ will stop at an algorithmically
random time.

□ 
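
Inequality (5) is elementary, and a quick numerical spot check makes the role of the condition n ≥ 4 visible. The sketch below (helper names are ours) tests $2^{|\mathrm{bin}(t)|} \ge 2^n \cdot |\mathrm{bin}(t)|$ on a sample of times $t \ge 2^{2n-1}$; it is a sanity check on a finite range, not a proof.

```python
def bin_len(t):
    """|bin(t)| = floor(log2 t) for t >= 1."""
    return t.bit_length() - 1

def inequality_5_holds(n, t):
    """Check 2^|bin(t)| >= 2^n * |bin(t)| with exact integers."""
    m = bin_len(t)
    return 2 ** m >= 2 ** n * m

# (5) is claimed for n >= 4 and t >= 2^(2n-1): sample the start of the
# range densely and then a few much larger times.
for n in range(4, 9):
    start = 2 ** (2 * n - 1)
    assert all(inequality_5_holds(n, t) for t in range(start, start + 2000))
    assert all(inequality_5_holds(n, start << j) for j in range(1, 40))

# For n = 3 the inequality already fails at t = 2^(2n-1) = 32.
print(inequality_5_holds(3, 2 ** 5))
```

The failure at n = 3 comes from $2^{n-1} < 2n - 1$ there, which is why the theorem asks for N ≥ 2 (so that n = N + c + 1 ≥ 4 when c ≥ 1).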

A time t is called "exponential stopping time" if there is a program p which stops on
U exactly at $t = t_p > 2^{2|p|+2c+1}$. How large is the set of exponential stopping times? To
answer this question we first need a technical result.

5 In the language of program-size complexity, x is "algorithmically random" if H(x) ≥ |x| - log(|x|).

Lemma 6. Let m ≥ 3, s ≥ 1. Then

\[ \frac{1}{2^s - 1} \sum_{i=0}^{s} \frac{2^i}{m+i} < \frac{5}{m+s-1}. \]

Proof. Let us denote by $x_s^m$ the left-hand side of the inequality above. It is easy to see
that

\[ x_{s+1}^m = \frac{2^s - 1}{2^{s+1} - 1} \cdot x_s^m + \frac{2^{s+1}}{2^{s+1} - 1} \cdot \frac{1}{m+s+1} \le \frac{x_s^m}{2} + \frac{2}{m+s+1}. \]

Next we prove by induction (on s) the inequality in the statement of the lemma. For
s = 1 we have $x_1^m = 1/m + 2/(m+1) < 5/m$. Assume that $x_s^m < 5/(m+s-1)$. Then:

\[ x_{s+1}^m \le \frac{x_s^m}{2} + \frac{2}{m+s+1} < \frac{5}{2(m+s-1)} + \frac{2}{m+s+1} < \frac{5}{m+s}. \]

□
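
The lemma can be spot-checked with exact rational arithmetic. The sketch below assumes that the left-hand side is $\frac{1}{2^s - 1} \sum_{i=0}^{s} \frac{2^i}{m+i}$, a reading which agrees with the base case $x_1^m = 1/m + 2/(m+1)$ used in the proof; the function names are ours and the check covers only a finite range of m and s.

```python
from fractions import Fraction

def lhs(m, s):
    """x_s^m = (1 / (2^s - 1)) * sum_{i=0}^{s} 2^i / (m + i)."""
    total = sum(Fraction(2 ** i, m + i) for i in range(s + 1))
    return total / (2 ** s - 1)

def rhs(m, s):
    """The claimed upper bound 5 / (m + s - 1)."""
    return Fraction(5, m + s - 1)

assert all(lhs(m, s) < rhs(m, s)
           for m in range(3, 40) for s in range(1, 40))
print(lhs(3, 1), rhs(3, 1))
```

The sum is dominated by its last few terms, which is why the bound decays like 1/(m + s).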

The density of times in the set {1, 2, ..., N} satisfying the property P is the ratio
$\#\{t \mid 1 \le t \le N, P(t)\}/N$. A property P of times has "effective zero density" if the
density of times satisfying the property P effectively converges to zero, that is, there is a
computable function B(k) such that for every N > B(k), the density of times satisfying
the property P is smaller than $2^{-k}$.

Theorem 7. For every length N, we can effectively compute a threshold time $\theta_N$ (which
depends on U and N) such that if a program of length N runs for time $\theta_N$ without halting,
then the set of times greater than $\theta_N$ at which the program can stop has effective zero
density. More precisely, if an N-bit program runs for time $T > \max\{\theta_N, 2^{2+5 \cdot 2^k}\}$, then
the density of times at which the program can stop is less than $2^{-k}$.

Proof. We choose the bound $\theta_N = 2^{2N+2c+1} + 1$, where c comes from (3). Let $T > \theta_N$ and
put m = 2N + 2c + 1, and $s = \lfloor \log(T+1) \rfloor - m$. Then, using Theorem 5 and Lemma 6
we have:

\[
\frac{1}{T - 2^m + 1} \cdot \#\left\{ 2^m \le t \le T \;\middle|\; V(\mathrm{bin}(t)) \ge \frac{2^{|\mathrm{bin}(t)|}}{|\mathrm{bin}(t)|} \right\}
\ge 1 - \frac{1}{T - 2^m + 1} \sum_{i=0}^{s} \frac{2^{m+i}}{m+i}
= 1 - \frac{2^m}{T - 2^m + 1} \sum_{i=0}^{s} \frac{2^i}{m+i}
> 1 - \frac{5}{m+s-1},
\]

consequently, the density of algorithmically random times effectively converges to 1:

\[
\lim_{T \to \infty} \frac{\#\{t \mid t > \theta_N,\ t \le T,\ t \ne t_p \text{ for all } p \text{ with } |p| = N\}}{T - 2^m + 1}
\ge \lim_{T \to \infty} \frac{1}{T - 2^m + 1} \cdot \#\left\{ 2^m \le t \le T \;\middle|\; V(\mathrm{bin}(t)) \ge \frac{2^{|\mathrm{bin}(t)|}}{|\mathrm{bin}(t)|} \right\} = 1,
\]

so the density of times greater than $\theta_N$ at which an N-bit program can stop effectively
converges to zero. □
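
To see how large the runtimes in Theorem 7 are, the bound can be evaluated with exact integers. The sketch below rests on two assumptions that should be read as such: the threshold in the theorem is taken to be $\max\{\theta_N, 2^{2+5 \cdot 2^k}\}$ with $\theta_N = 2^{2N+2c+1} + 1$, and the density of possible stopping times is bounded by 5/(m + s - 1) with m = 2N + 2c + 1 and s = ⌊log(T + 1)⌋ - m. The value c = 2 used in the loop is hypothetical.

```python
from fractions import Fraction

def density_bound(N, c, T):
    """Upper bound 5 / (m + s - 1) on the density of possible stopping
    times in [2^m, T], with m = 2N + 2c + 1 and s = floor(log2(T+1)) - m."""
    m = 2 * N + 2 * c + 1
    s = (T + 1).bit_length() - 1 - m
    return Fraction(5, m + s - 1)

def sufficient_runtime(N, c, k):
    """A runtime T exceeding max(theta_N, 2^(2 + 5 * 2^k))."""
    theta = 2 ** (2 * N + 2 * c + 1) + 1
    return max(theta, 2 ** (2 + 5 * 2 ** k)) + 1

for k in range(1, 6):
    for N in range(1, 7):
        T = sufficient_runtime(N, 2, k)      # c = 2 is a hypothetical value
        assert density_bound(N, 2, T) < Fraction(1, 2 ** k)

print(sufficient_runtime(3, 2, 2).bit_length())
```

Already for k = 2 the runtime has 23 binary digits, and the exponent itself doubles with every increment of k, which illustrates how quickly the guarantee becomes impractical.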

The next result states that "almost any" time is not an exponential stopping time. 

Corollary 8. The set of exponential stopping times has effective zero density. 

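Proof. It is seen that

\[
\{t \mid t = t_p, \text{ for some } p \text{ with } t > 2^{2|p|+2c+1}\}
\subseteq \bigcup_{N \ge 1} \left\{ t \;\middle|\; t > 2^{2|p|+2c+1},\ |p| = N,\ V(\mathrm{bin}(t)) < \frac{2^{|\mathrm{bin}(t)|}}{|\mathrm{bin}(t)|} \right\}
\subseteq \left\{ t \;\middle|\; t > 2^{2c+1},\ V(\mathrm{bin}(t)) < \frac{2^{|\mathrm{bin}(t)|}}{|\mathrm{bin}(t)|} \right\},
\]

which has effectively zero density in view of Theorem 7.

□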



7 Halting probabilities for different universal machines

In this section we show a significant difference between the halting probability of a pro- 
gram of a given length for a universal Turing machine and for a universal self-delimiting 
Turing machine: in the first case the probability is always positive, while in the second 
case the probability tends to zero when the length tends to infinity. 

The probability that an arbitrary string of length N belongs to $A \subseteq \Sigma^*$ is
$\mathrm{Prob}_N(A) = \#(A \cap \Sigma^N) \cdot 2^{-N}$, where $\Sigma^N$ is the set of N-bit strings.

If U is a universal Turing machine, then the probability that an N-bit program p
halts on U is $\mathrm{Prob}_N(\mathrm{dom}(U))$.

Fact 9. Let U be a universal Turing machine. Then, $\lim_{N \to \infty} \mathrm{Prob}_N(\mathrm{dom}(U)) > 0$.

Proof. We consider the universal Turing machine $U(0^i 1 x) = T_i(x)$, described in section 3.
If N is sufficiently large, then there exists an i < N such that $T_i$ is a total function,
i.e. $T_i$ is defined on each input, so $\#(\mathrm{dom}(U) \cap \Sigma^N) \ge 2^{N-i-1}$. For all such N's,
$\mathrm{Prob}_N(\mathrm{dom}(U)) \ge 2^{-i-1} > 0$. The result extends to any universal Turing machine
because of universality.

□ 

A convergent machine V is a Turing machine such that its ζ number is finite:

\[ \zeta_V = \sum_{\mathrm{bin}(n) \in \mathrm{dom}(V)} \frac{1}{n} < \infty, \]

see [9]. The Ω number of V is $\Omega_V = \sum_{N=1}^{\infty} \mathrm{Prob}_N(\mathrm{dom}(V))$. Because $\zeta_V < \infty$ if and
only if $\Omega_V < \infty$, see [9], we get:

Fact 10. Let V be a convergent machine. Then, $\lim_{N \to \infty} \mathrm{Prob}_N(\mathrm{dom}(V)) = 0$.

Recall that a self-delimiting Turing machine V is a machine with a prefix-free domain.
For such a machine, $\Omega_V \le 1$, hence we have:

Corollary 11. Let V be a universal self-delimiting Turing machine. Then
$\lim_{N \to \infty} \mathrm{Prob}_N(\mathrm{dom}(V)) = 0$.
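
A toy prefix-free domain shows the mechanism at work. The set below, the strings $1^n 0$ for n < 20, is a hypothetical example and certainly not the domain of a universal machine; the point is only that the per-length probabilities sum to at most 1 for a prefix-free set, so they must tend to zero.

```python
from fractions import Fraction

def is_prefix_free(words):
    """True if no word is a proper prefix of another word."""
    ws = set(words)
    return not any(w[:j] in ws for w in ws for j in range(len(w)))

def prob_N(domain, N):
    """Prob_N(domain) = #(domain ∩ Σ^N) · 2^-N."""
    return Fraction(sum(1 for w in domain if len(w) == N), 2 ** N)

# Toy prefix-free domain (hypothetical, not universal): 1^n 0 for n < 20.
domain = ["1" * n + "0" for n in range(20)]
omega = sum(prob_N(domain, N) for N in range(1, 21))

assert is_prefix_free(domain)
assert omega <= 1
print(prob_N(domain, 3), omega)
```

Here the probability at length N is exactly $2^{-N}$; for a universal self-delimiting machine the decay is not given by a formula, but the same summability argument applies.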

The probability that an N-bit program never stops on a convergent Turing machine
tends to one when N tends to infinity; this is not the case for a universal Turing machine.

8 Final comments 

We studied the halting probability using a new approach, namely we let the
probability space extend over both space and time, and considered the probability that a
random N-bit program will halt by a random later time given that it hasn't stopped by
some threshold time. We postulated an a priori computable probability distribution on
all possible runtimes. Consequently, the probability space is the product of the space
of programs (either taken to be all programs of a fixed length, where programs are
uniformly distributed, or to be all programs of all possible lengths, where the distribution
depends on the length) and the time space, which is discrete and has an a priori
computable probability distribution. We proved that given an integer k > 0, we can
effectively compute a time bound T such that the probability that an N-bit program
will eventually halt, given that it has not stopped by time T, is smaller than $2^{-k}$.

We also proved that the set of halting programs (which is computably enumerable,
but not computable) can be written as a disjoint union of a computable set and a set of
probability effectively smaller than any fixed bound.

Finally we showed that runtimes much longer than the lengths of their respective
halting programs are (effectively) rare. More formally, the set of times at which an
N-bit program can stop after the time $2^{N+\text{constant}}$ has effectively zero density.

Can we use this type of analysis for developing a probabilistic approach for proving 
theorems? 

The class of problems which can be treated in this way is the class of "finitely refutable
conjectures". A conjecture is finitely refutable if verifying a finite number of instances
suffices to disprove it [6]. The method seems simple: we choose a natural universal
Turing machine U and to each such conjecture C we can effectively associate a program
$\Pi_C$ such that C is true iff $U(\Pi_C)$ never halts. Running $U(\Pi_C)$ for a time longer than the
threshold will produce good evidence of the likely validity of the conjecture. For
example, it has been shown [3] that for a natural U, the length of the program validating
the Riemann Hypothesis is 7,780 bits, while for Goldbach's Conjecture the length of
the program is 3,484 bits.

Of course, the choice of the probability distribution on the runtime is paramount. 
Further, there are at least two types of problems with this approach. 

First, the choice of the universal machine is essential. Pick a universal U and let p be 
a program such that U(p) never stops if and only if a fixed finitely refutable conjecture 
(say, the Riemann Hypothesis) is true. Define W such that W(l) = U(p) (tests the 
conjecture), and W(0x) = U(x). The Turing machine W is clearly universal, but working 
with W "artificially" makes the threshold very small. Going in the opposite direction, 
we can write our simulator program in such a way that it takes a huge number of steps 
to simulate the machine — say Ackermann's function of the runtime given by the original 
machine. Then the new runtime will be very long, while the program is very short. Or 
we could choose very powerful instructions so that even a ridiculously long program on 
the original machine would have a very short runtime on the new one. 

The moral is that if we want to have some real idea about the probability that a 
conjecture has a counter-example, we should choose a simulator and program that are 
"honest": they should not overcharge or undercharge for each time-step advancing the 
computation. This phenomenon is very similar to the fact that the complexity of a single 
string cannot be independent of the universal machine; here, the probability of halting 
cannot be independent of the machine whose steps we are counting. 

Secondly, the threshold T will increase exponentially with the length of the program
$\Pi_C$ (of course, the length depends upon the chosen U). For most interesting conjectures
the length is greater than 100, so it is hopeless to imagine that these computations can
be effectively carried out (see [H] for an analysis of the maximum speed of dynamical
evolution). It is an open question whether another type of computation (possibly,
quantum) can be used to speed up the initial run of the program.